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A COMPARISON BETWEEN SOME METHODS FOR COMPUTING 
OPTIMUM PATHS IN THE PROBLEM OF BOLZA 

Frank D. Faulkner, Department of Mathematics and 

Mechanics, U. S. Naval Postgraduate School, 

Monterey , Ca lif ornia . 

A comparison is made between methods for calculating 
optimum trajectories in the problem of Bolza. For the most 
part the comparison is between the method of steepest ascent 
as given by Bryson and Denham ( 1 ) and a direct method due to 
the author, using the differential techniques which G. A. Bliss 
developed during World War I in ballistics. A second purpose 
of the paper is to give a modification of Bryson and Denham 1 s 
method which eliminates one of the minor subproblems and 
makes it possible to use a trajectory obtained by their method 
to start the routine for the other method. The reason for 
this will be given. 

Both methods seem to yield reasonably satisfactory 
results and both have minor problems associated with actual 
computation, the most recurrent being that of convergence. 

Both methods use the adjoint system of differential 
equations much as used by Bliss; the adjoint system seems 
imperative for the solution and understanding of all but the 
simplest problems. 

Bryson and Denham's method, henceforth called the BFL 
method is based on the constructive proof of the fundamental 
lemma of the calculus of variations. This shows how, if an 
admissible path is given on which the Euler equations are not 
satisfied, a better admissible path can be obtained. A modi- 
fication has been made up to eliminate the trouble of integra- 
ting forward and then backward. In the other method only 
extremals are used. Usually Bliss's differential methods are 
used to set up a Newton iteration for the constants, though 
they may also be obtained directly (see (2)). This method 
will be called the FD method. ^ 

The comparisons so far suggest the following conclusions. 
There are some advantages of the FD method. (1) It converges 
more rapidly, if it converges. It is a Newton iteration for 
the roots to a set of equations and tends to converge exponen- 
tially once the solution is approached. (2) There is no 
question of how closely the path approaches an extremal, since 
only extremals are used. This is often not too important from 
a practical point of view, since trajectories "near" the 
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optimum are generally "almost” as good. (3) Discontinuous or 
even unbounded control and constraints on the variables seem 
to pose no major problem. (4) The optimum correction to a 
perturbed optimum trajectory involves only a few more itera- 
tions. (5) Various identities and relations associated with 
extremals cafy be used; for example, if the independent vari- 
able does nott enter explicitly, the Hamiltonian is constant. 

(6) There is no arbitrary step size in the solution; they are 
determined by the Newton-Raphson routine. (7) The method seems 
more flexible in treating unusual problems. Solutions were 
obtained for the banfe-bang control problem (3), for problems 
with discontinuous bounded control and further limitations on 
the energy in rocketry (4), (5 ) , (6 ) , and for problems with 
impulse solutions; simple allocation problemshave also been 
treated by this method. 

There are several advantages to the BFL method. (1) It 
seems to require less skill or luck in getting started. 

(2) The integral relations for the corrections are computa- 
tionally more stable. They involve only first derivatives 
whereas the other method involves second derivatives with 
respect to the control variables. (3) It seems easier to get 
a convergent sequence of paths without interrupting the 
computation. Another FL method (that is, one based on the 
fundamental lemma) is due to Kelley (^7); it was not investi- 
gated at length. It had the desirable feature of simplicity 
but it converged more slowly, perhaps due to inability of the 
author to select a sequence of parameters well. 

The methods also have their problems and difficulties. 

In the FD method the parameters are the constants of integra- 
tion associated with the adjoint system of differential equa- 
tions . The programmer may have no idea as to the proper 
range of these. The other method requires only a likely 
initial path; while even this may be hard to find, the uncer- 
tainty seems to be an order less. Usually the FD method must 
be modified if the strong Legendre condition is not satisfied. 
The methods based on the fundamental lemma are computationally 
more stable; in the direct method the control variable is ob- 
tained every step from equations which involve second deriva- 
tives in the denominator. In the FL methods both forward and 
backward integration are usually used. This, however, is only 
a matter of programming; a routine is given which eliminates 
this. Another problem associated with FL methods is that 
there is no obvious metric which indicates how much the path 
differs from an extremal. Finally, no way is seen to extend 
FL methods to problems involving discontinuous or bounded 
control generally. 

Since FL methods are more likely to converge, but tend 
to converge more slowly and it is not clear when convergence 
i is -attained, it is desirable to get a "good" path using the 
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FL method, then use it to obtain a set of initial values for 
the Lagrange multipliers to start a differential rqutine 
wherein only extremals are used. A procedure is given for 
this, together with a measure of the variation of any curve 
from an extremal. Tlje measure has no obvious significance 
except that ^t is zero whenever the curve is an extremal and 
is positive otherwise. 

1. Basic Equations 

The equations which govern the system are assumed to have 
the form 

x 1 « f 1 , i » 1> * * • >n, (1.1) 

where f* 23 f*(x,p,t), x = (x^) is an n-dimensional vector 
P 3 (p a ) is an m-dimensional vector, and t is a scalar 
which we may think of as time. The set (x 1 ) are called state 
variables and the set (p a ) are called control variables. The 
p's are to be chosen as functions of time to effect whatever 
optimum is desired. Vectors may be written in index notation 
x , in vector notation x or in matrix notation (x) or 
(xi), whichever suggest the important properties. Partial 
derivatives will be denoted by subscripts when no error seems 
likely; for example f^ « Sfi/BxJ and f* - dfVdp a ; Latin 
i indices will have the J range 1-n and Greek indices will have 
the range 1-m. It will be assumed that whatever derivatives 
occur are continuous . 

Some familiarity with calculus of variations is assumed. 

A typical problem is that of going from one specified point to 
another, from (xq) to (X) or (Xrp) in such a way that the 
terminal value T of the time is a minimum. The form of the 
equations involved is the same, independent of the particular 
problem. The variational equations are 

6x* * fj6x^ 4* f*6p Q (1.2) 

(the summation signs will be omitted if no error seems likely), 
and the associated adjoint system of equations for the Lagrange 
multipliers is 

* » - . (1.3) 

The function 

H= X i f i =x:-f (1.4) 

is called the hamiltonian. The Euler equations 

H a - 0 - x i4 

are necessary for an optimum if p lies in an open region. 

An admissible path is one which begins and ends at the 
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specified initial and terminal points. The fundamental lemma 
states that if we have an admissible path on which Eq. (1.5) 
is not satisfied, then we can construct a better admissible 
path. The principal difficulty is in matching end conditions, 
in solving a two-, or multi-point boundary problem. 

We sha'll need two fundamental sets for the adjoint 
system. In the first one, the initial conditions are given: 
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as we see 



(1.6) 



There is also the Green's formula for the variations of 
the end values of the state variables, as given by Bliss (8), 
r T 

SxV) - X 1 -! 6p°dt (1.7) 

J o 

as a consequence of Eqs. (1.2) and (1.3). The corresponding 
differentials are 

dx 1 (T) « Sx^T) + x i (T)dT. (1.8) 

Finally, let us consider a special set of variations ; 
let _ 

6p * S c^ 1 *^ . (1.9) 

Then 

^(T) - c X j -f a X^f a dt (1.10) 

J 0 J 

which has the form 



fix 1 » A*^c ; (1.11) 

^ i 1 

The matrix A with elements A , 

T 

a ij - r^.fiJ.fdt (i.i2) 

^0 

is a positive semi-definite (at least) symmetric matrix, 
since it is a gramian or a sum of gramian matrices. 

Since we will generally integrate forward it will be 
convenient to have this in terms of X^ . If we define the 
matrix a with elements 
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A^j « A kl L ki L 1 j (or a « L*AL) . (1.14) 

We shall be concerned almost entirely with problems wherein 
A and a have rank n except when the path is an extremal, 
and on extremals they have rank n-1. An extremal is a curve 
.whereon Eq. (1.5) is satisfied. Implicit in the above 
relations is that there are no corners. 

2. Method of the Fundamental Lemma 

In this section a variant of the BF method is given which 
involves only forward integration. The essential of this 
method is that there are two sets of corrections or variations 
generated. The first set drives the path toward admissibility 
and the second is orthogonal to these and drives the path 
toward an extremal. 

Let us consider the minimum time problem as a typical 
problem. The initial point (xl) and the terminal point (X 1 ) 
are specified and we wish to minimize the terminal value T of 
t. Let us choose a likely path by choosing or generating in 
some fashion the control variable. Let us compute the trajec- 
tory, storing the control variable and a fundamental set of 
solutions for the adjoint system. Let us calculate also. the 
integrals A^j . We can invert the matrix L and get A*- j . 

If we consider the special variations of Eq. (1.9), then by 
Eqs. (1.7), (1.8), and (1.12) we get 

dx 1 (T) = A iJ Cj + x i (T)dT. (2.1) 

If we treat differentials as differences, we get 

x 1 - x*(T) - A^Cj + ^OOAT. (2.2) 

This is a set of n equatons for n+1 unknowns c ,•••, 

V AT. 1 

Let us now choose a set of c r s to drive the path to- 
ward admissibility. Let us examine the n matrices obtained 
from the n by n+1 matrix (A^i,x^) of Eq. (2.2) by omitting 
the first n columns one at a time. Let us choose the one of 
these which has the largest determinant. Assume for definite- 
ness that we omit the n f th column* Let us then solve 



X 1 - x^T) 



A^c^ + A i ^C2^ +• 



•+ A 1 nml c 
+ x i AT . 



n-1 1 



(2.3) 
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If we choose c „ 4 0 then this set of variations tends to 
nl 

make the resulting path admissible. Now let us choose a 
second set of variations with c^ * e, satisfying the 
equations .. . 

0 = A 1 ^Cj2 + XAT 2 . (2.4) 

The first-order effects of this do not change the admissibility 
of the path. The set c ^2 ,C 22 ’ * * ,C n-l 2*^2 are ^^ near 
e. If AT 2 /e is positive, choose e negative and vice versa. 

A minor problem is to choose e well. If it is chosen too 
large, the corrections overshoot and if it is too small, the 
sequence creeps toward the desired extremal. 

This routine has the advantage that it avoids the alter- 
nate forward and backward integration of the original method. 
This is at the expense of computer storage. It also avoids 
a term that to the author was puzzling, the quantity in the 
radical for choosing the step size. The indeterminancy in 
choosing the step size is inherent in the method unless the 
operator has some feel for the problem. 

Storage of the adjoint variables, for correcting the 
control variables may be eliminated by using the values of the 
c's found in one iteration with the solutions to the adjoint 
generated in the next. 

i 

3. Differential Method 



In this method only extremals are used. The Euler 
equation, Eq. (1.5) will be assumed to be satisfied identically 
on every path considered. Implicitly then, the control vari- 
able maximizes H at an interior point. The problem is that 
the maximization occurs for some unknown solution 

X = C\ (3.1) 

to the adjoint. Now suppose that we have been able to get an 
estimate of the C's for the desired extremal. It happens 
that we may choose one C, say C^, arbitrarily, since the 

essential equations are homogeneous in the X's; we must only 
be certain that the choice has the right sign. 

Now let':us consider the effects of changing the C's. 

,Let us change the C's and the control variables, assuming 
that we can do this leaving x,t fixed. From Eqs . (1.5) and 
(3.1) we get 

VV ci + ^*^aT 6pT " °» ( 3 * 2 ) 

a term involving 6x has been omitted from these equations , 
under the assumption that x,t could be held fixed. We can 
solve this equation for 6p in terms of dC provided the m'th 
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order determinant with elements X*f . is not zero, and sub- 



dx i (T) 



OT 

stitute into Eqs. (1.7) and (1.8) to get n equatipns of the 

form , 

n-l . , 

E B*dC J + x 1 (T)dT ; (3.3) 

1 

the B. are integrals. This is the basic relation for setting 
up thS Newton-Raphson iteration. We run a trajectory, calcu- 
lating the integrals BjJ we treat differentials as differ- 
ences; we substitute X*- - x^-(T) for dxi and solve for ACi, 
AT. 



The principal problem is that of convergence. Various 
problems involving discontinuous control have been solved by 
this method. A term must be added in the differential for- 
mulas for each corner where the value of t is not specified, 
and corner conditions must be met. It seems to the author 
that if the path has corners the programmer must anticipate 
this, since the control variable is discontinuous. Indeed the 
difficulties in finding the roots of algebraic equations sug- 
gest that no method exists for checking the Weierstrass 
condition generally. 

4. Determining a 'Best-fitting' Extremal. 



The studies have indicated that it is easier to develope 
a fairly good path with FL methods, but it is not clear how 
nearly an extremal has been approached. The differential 
method uses onlyi extremals and tends to converge exponentially 
when it converges, but it is harder to get a satisfactory 
approximation for starting. This suggests that we might use an 
FL method initially, and then use it to determine a 'best- 
fitting' extremal, to continue with the differential method. 

The problem then becomes: how can we get a best-fitting 
extremal for the path just found? The answer lies in the 
method of correction for the FL method. Let us consider the 
matrix A defined by Eq. (1.12). As a gramian or the sum of 
gramians it is symmetric and positive definite, or at least 
semi -definite. Hence its eigenvalues are either positive or 
zero and its eigenvectors are orthogonal, or in case of equal 
eigenvalues can be chosen so. 

Now let us consider Eq. (2.1) for fixed T. We may 
write it 

(6x) t - A(c) (4.1) 

Generally A has rank n but for extremals it has rank n-l. 
Let us assume this to be the case in the following. 

Then, by Eq. (4.1) we can attain at time T any point 
in some neighborhood of the endpoint of a curve which is not 
an extremal, but the endpoint of extremals define an n-l disc 
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in n-space. We have seen that with an extremal is a solution 
to the adjoint X, say X*. If E* is the extremal, embedded 
in a family (E) of extremals starting from (xq), then for 
fixed T, X*(T) is normal to the disc defined by* the end- 
points of (E) . 

Now consider the corrective routine and the matrix A. 

As the successive iterates approach an extremal A becomes 
singular. Let 0 < 7^ < 72 < * * *< 7' n be the eigenvalues of 



A, and assume that .7^ ^ 7 2 » 

responding eigenvectors of A. 
attained the extremal, so that 



Let X , 






be the cor- 



'Al* ’"‘An 
It is clear that if we had 
A were singular, then 



r *< T > |l ^A1 ' 



( 4 . 3 ) 



But X* is only determined to within a multiplicative factor. 
Hence X*(T) is determined as the eigenvector corresponding 
to the zero eigenvalue of A. We may get X*( 0 ) from the 
matrix L defined in the first section. Most computing cen- 
ters have routines for finding the eigenvalues and eigenvectors 
of a symmetric matrix. 

This suggest a measure of approximation to an extremal, 
either 7j/7o or 7]/7 » the ratio of the smallest eigen- 
value to either the next: smallest or to the largest. Unfor- 
tunately it seems to have just one important property; it 
becomes zero when an extremal is attained. 

The same procedure works equally well on a. When the 
curve approaches an extremal, the eigenvector corresponding to 
the zero eigenvalue gives the starting values for the adjoint. 
If the path is not an extremal, there seems to be no simple 
relation among the eigenvalues and eigenvectors of A and a, 
Since the only property of the matrix L is that its 
determinant is positive. 

I 

5 . Comments 

Considerable computation has been done in an attempt to 
compare the two methods, without a definite conclusion. Seibel 
(£) made comparisons mainly in the ship-routing problem wherein 
convergence is not a serious problem. Usually the time 
required for determining a course by the FD method was about 
two-fifths or half__that for the BFL method: 




Considerable work was done ini* trying to compare the two 
methods (10 ) on the problem of maximum range for a high-speed 
glider. The problem is on the verge of instability in two 
ways. First, a slight increase in initial velocity puts it 
virtually in orbit. Second, the range is apparently extremely 
sensitive to the value used for the radius of the earth. An 



ess in Astronautics and Rocketry 



increase of about thirty NM (nautical niles) in the value used 
caused an increase in range from about 39,000 NM to about 
54,000 NM. These are reflected in very large values for the 
Lagrange multipliers, one of them being about 10 1 times as 
large as another initially. Discussions suggest that this 
might be overcome by the adoption of a double-precision 
routine like that used by Johnson (11) in a related problem. 

The latest computations actually carried out gave a much better 
trajectory by the BFL method in the last moments of the tra- 
jectory. It does not seem possible, even knowing the initial 
values of the Lagrange multipliers very closely, to compute 
the complete trajectory for the desired extremal; it always 
seems to end in a loop. The ratio of the Lagrange multipliers 
which are so large initially determines the angle of attack. 

In the last moment these must both be driven to zero. This 
could not be done. It might be pointed out that this problem 
is very stable in another sense. It seems possible to get 
good open-loop programs and very good closed-loop programs, 
these leading to ranges quite near the maximum. Different 
models were used for the atmosphere, but a few computations 
suggested that this was not a significant factor. 

The FD method differs from the so-called method of 
second variations given by Breakwell, Speyer, and Bryson (12) 
in that the variations of the state variables were dropped 
in Eq. (3.2). It seems to the author that the name second- 
variation given to that method is a misnomer. The coefficients 
in the linear forms , of Eq. (3.2) are the same as the coef- 
ficients in the Legendre quadratic form but they are not 
second variations. 

The important thing is that we can calculate differen- 
tials associated with extremals. This method has been applied 
to several problems by the author and his students where no 
terms appeared corresponding to the second derivatives. 

McCalla ( 3 ) in his thesis treated a low-order bang-bang 
problem. Faulkner and Ward (f>) treat the problem of ballistic 
missile interception where bounded control and limited energy 
expenditure were imposed. Professor Bleick and the author 
treat a launching problem where there is an intermediate 
period of coasting. In these there are corners at points 
which are unknown initially; no way is seen to extend the 
method of the fundamental lemma to these, since the proof 
assumes that 'two-sided 1 corrections to the control are al- 
lowed. Others have advocated the use of 'slack 1 variables 
in posing problems such as these. It is the author's strong 
opinion that if the problem can be posed without these, its 
solution is simpler without them. 

The author felt perhaps that some fo the convergence 
problems were due to the omission of the terms. This is not 
always the case, since the two methods coincide 
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when the equations are linear in the state variables and 
serious convergence problems were encountered in tpe problem 
of rendezvous in minimum time, with linearized equations (4). 

The FJ) method also applies to the solution of some 
allocation problems of the following type. An integral of 
the form , > 



is to be maximized with respect to p and minimized with 
respect to q, which are probability density functions (13) ♦ 

It is easily posed as a simple linear control problem involving 
a third order system with limited total input, being unusual 
in that the solution is of minimax type. 
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